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Abstract 

In  another  paper  the  first  author  presented  an  asymptotic  result  concerning  the 
Mumford-Shah  functional, 

E(f,T)  =  (3  f(f- g)2+  [  | V/|2  +  a  length(r), 

J  n  Jn\r 

showing  that  if  g  is  approximately  piecewise  smooth  and  j3  is  sufficiently  large  then  T 
which  minimize  E  will  be  close  to  the  discontinuity  set  of  g  in  the  sense  of  Hausdorff 
metric.  In  this  paper  an  algorithm  is  presented  which  implements  the  scaling  sug¬ 
gested  by  the  asymptotic  results  to  obtain  accurate  localization  of  edges  even  when 
only  large  scale  edges  are  being  detected.  An  approximation  to  E  is  also  considered 
and  the  implications  of  the  asymptotic  results  to  this  functional  are  also  examined. 
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1  Introduction 

Edge  detection  is  regarded  as  one  of  the  fundamental  components  of  computer  vision. 
An  issue  which  arises  in  all  edge  detection  schemes  is  that  of  scale.  One  reason  for 
this  is  the  basic  insight  which  says  that  coarse  (i.e.  large  scale)  representations  are 
less  complex  than  more  detailed  ones.  Coarse  scale  descriptions  can  effectively  be 
used  to  select  “regions  of  interest”  for  detailed  processing,  thus  reducing  the  demand 
on  computational  resources.  It  is  important  therefore  that  coarse  scale  descriptions 
retain  those  features  of  the  data  which  are  required  for  effective  decision  making.  In 
the  case  of  edge  detection  for  example,  T-junctions  and  corners  play  important  roles 
in  estimating  the  depth  and  the  shape  of  objects  in  a  scene.  It  is  desirable  therefore 
that  even  at  coarse  scales  these  features  be  accurately  represented.  The  purpose  of 
this  paper  is  to  present  (without  proofs)  certain  analytical  results  on  the  Mumford- 
Shah  functional  related  to  its  scaling  properties,  and  then  to  show  how  these  results 
support  the  use  of  certain  approaches  to  edge  detection. 

By  edge  detection  we  mean  the  problem  of  location  step  discontinuities  in  an 
otherwise  smooth  function,  perhaps  in  the  presence  of  noise  and  smearing.  There 
are  many  methods  which  have  been  proposed  for  detecting  edges  of  this  type.  Two 
classes  which  comprise  a  majority  of  these  methods  are: 

1.  Techniques  which  consist  of  linear  filtering  followed  by  some  non-linear 
operation  such  as  the  locating  of  the  maxima  of  gradients. 

2.  Techniques  which  combine  the  smoothing  and  nonlinear  operation  into  a 
single  formulation  or  process. 

Well-known  example  from  the  first  class  include  the  Marr-Hildreth  edge  detector  [25] 
and  the  Canny  edge  detector  [8].  Within  the  second  class  we  find  Markov  Random 
Field  formulations  [17]  [26]  [14],  the  Variational  formulation  [7]  [28]  [29],  and  Non¬ 
linear  Diffusion  approaches  [30].  Historically  the  first  class  preceded  the  second.  The 
central  idea  the  second  class  added  to  those  of  the  first  class  was  that  of  performing 
simultaneous  edge  detection  and  smoothing  rather  than  the  previously  used  two  step 
procedures.  The  motivation  for  this  was  the  fact  that  the  smoothing  procedures  blur 


1  INTRODUCTION 


3 


across  boundaries  obscuring  and  distorting  the  edges,  especially  at  high  curvature 
locations  such  as  corners  and  T-junctions.  The  three  approaches  which  we  mentioned 
as  belonging  to  the  second  class  can  all  be  related  to  an  optimization  problem;  the 
minimization  of  some  functional  typically  having  three  terms.  One  example,  the 
point  of  reference  for  this  paper,  is  the  Mumford-Shah  functional  associated  with  the 
variational  formulation; 

E(f,  T)  =  p  f  (/  -  gf  +  /  | V/|2  +  a  length (T). 

Here  g  represents  the  data,  which  we  will  think  of  as  a  real  valued  function.  The  sym¬ 
bol  T  denotes  the  set  of  edges  in  the  image  and  /  is  a  piecewise  smooth  approximation 
to  g.  The  problem  is  to  minimize  E  over  admissible  /  and  T.  The  parameters  (3  and 
a  control  the  competition  between  the  terms;  they  decide  the  “scale”  of  the  edge 
detection.  The  the  second  term  of  E  provides  for  the  interaction  between  the  edges 
and  the  smoothing  mechanism  by  allowing  T  to  control  or  modulate  the  smoothness 
constraint  on  /.  It  was  expected,  and  demonstrated  to  some  extent  in  [7]  that  this 
approach  better  localizes  the  edges  than  the  methods  of  class  1.  However,  analysis  of 
optimality  conditions  revealed  that  optimal  T  can  have  only  certain  restricted  types 
of  local  geometries,  implying  in  particular  that  T-junctions  and  corners  tend  to  be 
distorted  [29].  On  the  other  hand  a  result  of  the  first  author  showed  that  these  con¬ 
straints  were  truly  of  a  local  nature  and  that  globally  the  variational  approach  should 
produce  reasonable  solutions.  The  result  is  an  asymptotic  one  stating  that  as  f3  — *  oo 
optimal  T  will  converge  in  an  appropriate  sense  to  the  discontinuity  set  of  g  provided 
the  noise  and  smearing  are  removed  from  the  data  sufficiently  quickly.  This,  then,  is  a 
kind  of  fidelity  result  for  the  variational  formulation.  Furthermore  the  results  suggest- 
a  principle  whereby  it  may  be  possible  to  recover  coarse  scale  edges  alone  with  the 
localization  accuracy  usually  available  only  on  the  finest  scale. 

An  algorithm  is  developed  which  on  a  small  time  scale  resembles  the  minimizing  of 
the  energy  functional  but  on  a  longer  time  scale  evolves  by  changing  the  parameters 
of  the  functional  in  the  direction  of  the  aforementioned  limit.  In  order  to  prevent  the 
resolving  of  microscopic  detail  as  the  limit  is  taken,  i.e.  in  order  to  retain  only  large 
scale  features,  we  systematically  remove  small  scale  features  as  if  they  were  a  distor- 
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tion  of  the  ideal  image.  The  rates  and  topological  structure  for  this  removal  (which 
is  achieved  by  edge  location  dependent  smoothing)  are  governed  by  the  convergence 
theorem  mentioned  above. 

The  implementation  of  the  variational  formulation  itself  presents  several  difficul¬ 
ties.  It  turns  out  that  there  is  a  method  for  approximating  E,  using  the  concept  of 
T  convergence  that  allows  for  a  straight-forward  implementation  (a  gradient  descent 
method).  Furthermore  we  can  argue  that  this  approximation  may  in  some  sense  be 
superior  to  the  original  one  in  the  manner  in  which  it  distorts  the  singularities  i.e. 
the  T-junctions  and  the  corners,  and  this  argument  is  based  on  the  principle  referred 
to  above. 


2  Summary  and  Outline  of  the  Paper 

For  the  sake  of  compactness  we  have  chosen  to  suppress  most  of  the  mathematical 
results.  In  Section  3  some  of  what  is  known  about  the  fundamental  question  of  exis¬ 
tence  of  miniinizers  to  the  (continuous)  variational  problem  is  reviewed.  Some  of  the 
implications  of  the  formulation  on  the  structure  of  minimal  boundaries  is  presented. 
These  results  help  to  motivate  our  work  which  aims  to  circumvent  those  constraints 
and  to  demonstrate  their  local  nature.  We  view  these  constraints  as  undesirable 
structural  restrictions  placed  on  solutions  by  ad  hoc  choices  in  the  formulation  of  the 
energy  functional.  What  the  functional  offers  in  return  is  a  relatively  simple  structure 
that  admits  analysis. 

Section  4  summarizes  our  main  contributions  to  the  analytical  understanding  of 
the  variational  formulation  of  the  segmentation  problem.  The  results  demonstrate  an_ 
asymptotic  fidelity  of  the  variational  approach.  The  ideas  inherent  in  these  results 
also  serve  as  the  primary  justification  and  motivation  for  the  algorithm.  To  obtain 
these  results  we  assume  that  the  image  is  a  corrupted  version  of  piecewise  constant 
or  piecewise  smooth  function,  depending  on  the  particular  problem  formulation.  The 
results  state  that  asymptotically  as  f3  — *■  oo  the  boundaries  given  as  a  solution  to 
the  variational  problem  converge  (in  Hausdorff  metric)  to  the  discontinuity  set  of  the 
underlying  image.  These  results  can  be  thought  of  as  a  counterpoint  to  the  results 
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of  the  calculus  of  variations.  The  results  of  the  calculus  of  variations  imply  that  the 
minimizers  have  certain  local  structure  which  from  the  image  processing  point  of  view 
may  be  undesirable.  The  limit  theorems  say  that  when  viewed  globally  the  solutions 
behave  well  and  asymptotically  essentially  any  structure  i.e.  any  boundary  geometry, 
can  be  recovered. 

In  Section  5  we  sketch  an  algorithm.  The  main  ideas  on  which  the  algorithm  is 
based  are  derived  from  the  limit  theorems.  The  structure  of  the  algorithms  closely 
resembles  that  of  the  limit  theorems;  in  fact  the  limit  theorems  can  be  interpreted  as 
consistency  results  for  the  algorithms. 

Section  6  is  devoted  to  presenting  the  T-convergent  approximation  to  E.  Some 
basic  properties  of  solutions  to  this  approximation  are  stated.  In  Section  7  the  details 
of  the  computation  are  given.  In  Section  8  we  present  the  results  of  some  experiments. 


3  The  Variational  Model  for  Edge  Detection 

Three  techniques  for  image  segmentation  and  reconstruction  based  on  intensity  in¬ 
formation  which  have  recently  gained  considerable  attention  are  Markov  Random 
Fields,  Variational  Formulations,  and  Non-linear  Filtering.  Most  researchers  in  this 
area  have  realized  that  these  methods  are  closely  connected  (see  [15]  or  [30]);  the 
practical  differences  lying  mostly  in  the  conception  of  the  computation  to  be  carried 
out.  The  essential  feature  which  these  models  are  designed  to  capture;  simultaneous 
smoothing  and  edge  enhancement /boundary  detection,  is  achieved  in  essentially  the 
same  way.  Our  work  is  connected  with  these  methods,  it  is  most  convenient  to  relate 
it  to  the  Variational  formulation. 

The  Variational  formulation  models  edge  detection  as  the  minimization  of  an 
energy  functional.  The  functional  introduced  by  Mumford  and  Shah,  [28]  [29],  and 
(in  discrete  form)  referred  to  as  the  weak  membrane  by  Blake  and  Zisserman  [7]  is 
the  following, 

E(f,  T)  —  f3  f  {f  —  g)2  +  f  | V/|2  +  a  length(r) 

Jo  J o\r 

where  a  and  (3  are  positive  real  scalars  (the  parameters  of  the  problem)  and  /  is  a 
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piecewise  smooth  approximation  to  g  having  discontinuities  only  on  the  set  T  which 
one  interprets  as  the  edges  found  in  the  image.  The  first  term  of  E  penalizes  the 
fidelity  of  the  approximating  image  /  to  the  data  g.  The  second  term  imposes  some 
smoothness  on  /.  The  third  term  penalizes  the  total  length  of  the  boundary  (which 
we  think  of  as  the  union  of  curves).  The  removal  of  any  term  results  in  trivial  solutions 
yet  with  all  three  terms  the  functional  captures  in  a  simple  way  the  desired  properties 
of  a  segment  at  ion/ approximation  by  piecewise  smooth  functions. 

The  parameters  /?  and  a  have  to  be  chosen.  Since  we  have  not  fixed  them  a  priori 
we  have  really  defined  a  two  dimensional  space  of  functionals.  It  is  of  interest  to 
examine  certain  limiting  versions  of  the  functional. 

Consider  allowing  j3  and  a  to  tend  to  zero  while  keeping  their  ratio  fixed.  Relative 
to  the  other  terms  the  smoothing  term  would  dominate.  Clearly  any  limit  of  mini- 
mizers  would  necessarily  be  a  locally  constant  function  on  0\T  (where  T  would  be  the 
limiting  boundaries.)  Mumford  and  Shah  were  thus  lead  to  introduce  the  following 
functional, 

E0(f,  H  =  W  (fi  -  gf  +  a  length(r) 

i 

where  T  =  0\  U*  fi;  and  the  /;  are  constants.  This  functional,  because  of  its  greater 
simplicity  lends  itself  to  more  thorough  analysis. 

The  energy  functional  associated  with  the  variational  model  is  ad  hoc.  In  [29]  the 
structure  of  minimizers  of  the  functional  was  studied  via  the  calculus  of  variations.  A 
summary  of  a  few  of  the  results  on  the  structure  of  minimal  T  is  presented  here  since 
this  paper  is  partially  motivated  be  the  desire  to  circumvent  some  of  these  constraints. 
The  following  constraints  on  F’s  which  minimize  E  were  proved  by  Mumford  and  Shah 
in  [29].  They  are  illustrated  in  Figure  1. 

•  If  r  is  composed  of  C1’1  arcs  then  at  most  three  arcs  can  meet  at  a  single  point 
and  they  do  so  at  120°. 

•  If  T  is  composed  of  C1’1  arcs  then  they  meet  dS 2  only  at  an  angle  of  90°. 

•  If  T  is  composed  of  C1'1  arcs  then  it  never  occurs  that  two  arcs  meet  at  an  angle 
other  than  180°. 
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Non-minimal  Geometries 


Corresponding  Minimal  Geometries 


Figure  1:  Calculus  of  Variations  Results 
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•  If  x  G  T  and  in  a  neighborhood  of  x,  T  is  the  graph  of  a  C2  function  then 
(/2(/  —  #)2  +  |V/|2)+  — (/?(/  — p)2  +  |V/|2)~  +  ctcurv(r)  =  0  where  the  superscripts 
-f  and  —  denote  the  upper  and  lower  trace  of  the  associated  function  on  T  at  * 
and  curv(T)  denotes  the  curvature  of  T  at  x. 

The  difficulty  with  these  results  is  that  they  do  not  support  the  use  of  the  varia¬ 
tional  approach  as  an  image  segmenting  scheme  with  respect  to  the  goal  of  obtaining 
intuitively  appealing  segmentations.  In  particular  T-junctions,  believed  to  be  impor¬ 
tant  image  features,  tend  to  be  distorted  and  corners  tend  to  be  rounded  out.  The 
restrictions  on  the  geometry  of  the  edges  arises  out  of  the  model  and  are  artifacts  of 
the  particular  formulation  and  do  not  reflect  an  intrinsic  property  of  the  problem  at 
hand.  How  then  can  one  improve  upon  such  as  hoc  models  ?  One  idea  is  the  follow¬ 
ing.  Consider  the  set  of  all  possible  minimizers  of  the  functional  E,  over  all  possible 
values  of  the  parameters.  Each  of  these  minimizers  possess  the  properties  which  the 
model  imposes.  However,  if  we  take  the  closure  of  these  functions  in  an  appropriate 
topology  we  may  widen  the  class  of  functions  considerably.  The  asymptotic  theorems 
outline  in  Section  4  indicate  that  particular  meaningful  members  of  such  a  closure 
may  be  found  by  taking  the  parameters  associated  with  the  functional  to  certain 
limits.  In  fact  in  this  way  one  can  produce  essentially  any  piecewise  smooth  function 
with  edges  having  arbitrary  geometries.  An  idea  which  clearly  presents  itself  is  to 
develop  an  algorithm  in  which  the  same  limit  is  taken.  This  is  one  of  the  purposes  of 
this  paper. 

In  all  known  segmentation/edge  detection  schemes  there  exist  parameters  which 
can  be  related  in  some  sense  to  “scale”.  There  is  no  generally  accepted  definition  of 
“scale”  but  often  one  has  notions  of  size,  contrast  and  geometry  in  mind.  Consider," 
for  example,  edge  detection  techniques  based  on  convolution  of  the  image  with  Gaus¬ 
sian  kernels  followed  by  detection  of  gradient  maxima  [24]  [25].  Here  the  relevant 
parameter  is  cr,  the  “variance”  of  the  kernel.  For  each  value  of  cr  one  obtains  a  dif¬ 
ferent  set  of  edges.  Ideally  one  would  hope  that  as  cr  increases  that  the  set  of  edges 
would  decrease  monotonically.  However,  this  is  not  the  case;  in  general  the  edges 
drift  as  the  scale  varies.  Since  on  the  finest  scales  it  is  desirable  to  know  which  edges 
correspond  to  gross  features  in  the  image  there  arises  the  problem  of  finding  within 
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the  small  scale  edges  those  that  correspond  with  large  scale  features.  It  is  true  that 
the  edge  sets  vary  continuously  as  a  function  of  the  parameter  cr  but  in  general  it  is 
computationally  too  costly  to  compute  boundaries  for  sufficiently  dense  a  set  of  a  to 
make  the  tracking  obvious. 

In  the  energy  base  formulations  there  are  usually  2  free  parameters  associated  with 
the  problem.  In  this  sense  the  use  of  the  word  “scale”  is  misleading.  When  Blake 
an  Zisserman  [7]  speak  of  varying  the  scale  of  the  problem  they  consider  varying  the 
coefficient  on  the  smoothing  term  in  E  (which  is  set  to  1  in  our  formulation.)  This 
is  equivalent  to  varying  a  and  (3  while  keeping  their  ratio  fixed.  While  -K=  can  be 

V* 

interpreted  as  the  analog  to  cr,  keeping  the  ratio  smoothing  conception  of  “scale” 
since  this  has  the  effect  of  keeping  the  total  quantity  of  boundary  and  the  localization 


errors  roughly  constant.  In  our  limit  theorems  we  (usually)  keep  a  fixed  and  let 
(3  tend  to  oo.  Thus  for  a  fixed  a  “scale”  can  be  thought  of  as  proportional  to  -^==. 
However,  in  general  there  are  two  parameters  and  these  parameters  describe  the  range 
of  functionals  under  consideration.  In  [36]  the  significance  of  the  two  parameters  is 
studied  in  the  context  of  the  1-dimensional  segmentation  problem. 


3.1  Existence  Results 

For  the  functional, 

Eo  =  f3  Y,  f  ( fi  ~  9  f  +  length(T) 

i  i/fij 

(where  the  fi,'  are  the  connected  components  of  fi\T  and  the  /;  are  constants),  the 
following  has  been  proved. 

Theorem  1  [27]  Let  fi  be  an  open  rectangle  and  let  g  €E  T°°(fi).  For  all  one¬ 
dimensional  sets  T  C  fi  such  that  T  U  <9fi  is  made  up  of  a  finite  number  of  C 1,1  - 
arcs,  meeting  each  other  only  at  their  end-points,  and,  for  all  locally  constant  func¬ 
tions  f  on  fi\T,  there  exists  an  f  and  a  T  ivhich  minimize  Eq. 

Mumford  and  Shah  [29]  proved  a  similar  theorem  with  the  restriction  that  g  be 
continuous  of  fi.  In  this  case  they  showed  that  T  is  composed  of  a  finite  number  of 
C7  curves.  The  proof  relied  heavily  on  results  from  geometric  measure  theory.  The 
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theorem  quoted  above  was  proved  by  Morel  and  Solimini  [27]  using  direct,  constructive 
methods.  Finally,  another  proof  using  T  restricted  to  be  unions  of  line  segments  and 
then  taking  limits  as  the  segment  lengths  tend  to  zero  was  achieved  by  Y.  Wang  [37]. 

To  formally  state  the  best  known  existence  theorem  for  E  we  will  require  a  little 
more  notation.  It  is  still  an  open  problem  to  show  existence  of  minima  for  the  func¬ 
tional  E  with  sufficient  regularity  of  the  boundary  to  allow  the  analysis  of  Mumford 
and  Shah  [29]  to  go  through.  The  best  available  existence  results  necessarily  allows 
the  boundary  T  to  be  sufficiently  irregular  that  ‘length’  cannot  be  defined  for  it.  This 
measure  is  therefore  replaced  with  a  more  general  measure. 


3.1.1  Hausdorff  Measure 


A  curve  T  C  5Rn  is  the  image  of  a  continuous  injection  g  :  [0, 1]  — +  9?n.  The  length  of 
a  curve  T  is  defined  as 

771 

L(T)  =  sup(X;  ||flf(*i)  -  0(<t-i)||  :  0  =  to  <  h  <  •  •  •  <  tm  =  1} 

i- 1 

and  T  is  said  to  be  rectifiable  if  T(T)  <  oo. 

For  a  non-empty  subset  A  of  5in,  the  diameter  of  A  is  defined  by  diam(A)  = 
sup{||a:  —  t/||  :  x,y  £  A}.  Define 

oo  oo 

U\(A)  =  inf{£diam {Ui)3  :  A  C  \J  Uh  diam (Ui)  <  8}, 

t=i  i=i 

The  Hausdorff  1-dimensional  measure  of  A  is  then  given  by 


Til{A)  =  hm7fj(A)  =  sup7f^(A) 

s~t0  6>  o 


Many  properties  of  Hausdorff  measure  can  be  found  in  [11,  12,  32].  The  following 
theorem  states  that  TLl  is  a  generalization  of  length,  as  required. 


Theorem  2  IfV  C  is  a  curve,  then  TLl{V)  —  T(r). 


Proof  See  [11]  Lemma  3.2. 


□ 
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The  following  theorem  is  a  structure  theorem  for  closed  sets  of  finite  H1  measure. 
We  refer  to  a  compact  connected  set  as  continuum. 

Theorem  3  If  T  is  a  continuum  with  H1^)  <  oo,  then  T  consists  of  a  countable 
union  of  rectifiable  curves  together  with  a  set  of  Tt1  -measure  zero. 

Proof  See  [11],  Theorem  3.14.  □ 

A  natural  “weak”  formulation  of  the  variational  principal  is  thus  the  following, 

E(f,  T)=fi  jig  -  ff  +  /  || V/||2  +  H\Y)  (1) 

where  T  being  a  relatively  closed  subset  of  fi  and  /  (E  W1,2(fi\r)  where  W1’2  is  the 
Sobolev  space  as  defined  in  [1].  An  existence  result  now  exists  for  this  formulation  due 
to  results  of  Ambrosio  [2]  [3]  [4]  and  DeGiorgi-Carriero-Leaci  [10]  but  it  is  beyond 
the  scope  of  this  paper  to  describe  these  results. 

4  The  Asymptotic  Theorems 

In  this  section  we  state  the  limit  theorems.  The  proofs  are  beyond  the  scope  of  this 
paper  and  will  be  published  elsewhere  (they  may  be  found  in  [31]).  The  theorems 
are  concerned  with  what  happens  to  solutions  of  the  variational  formulation  of  the 
segmentation  problem  as  0  — *  oo.  Some  notation  is  required  so  that  we  may  define 
what  we  mean  by  “convergence  of  a  set  of  edges”. 

4.0.2  The  Hausdorff  Metric 

For  4  C  3?",  the  e-neighborhood  of  A  will  be  denoted  by  [A]€  and  is  defined  by 

[A\e  =  {x  £  3?"  :  inf  ||®  -  t/||  <  e} 

ye  A 

where  ||  •  ||  denotes  the  Euclidean  norm.  In  the  terminology  of  mathematical  mor¬ 
phology  [35],  [A]e  is  the  dilation  of  A  with  the  open  ball  of  radius  e.  The  notion 
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of  distance  between  boundaries  which  we  will  use  is  the  Hausdorff  metric.  Denoted 
</#■(•,•),  the  Hausdorff  metric  is  evaluated  by 

dH{Ax,A2)  =  inf{e  :  Ai  C  [A2}f  and  A2  C 

Elementary  considerations  show  that  dji( *,  •)  is  in  fact  a  metric  on  the  space  of  all 
non-empty  compact  subsets  of  5?n. 

The  results  of  this  section  assert  that  if  the  image  is  ideal  i.e.  a  piecewise  smooth  or 
piecewise  constant  function  depending  on  whether  E  or  Eo  is  being  considered,  then 
the  optimal  boundaries  T  converge  to  the  discontinuity  set  of  image  with  respect  to  the 
Hausdorff  metric.  Furthermore  the  convergence  still  holds  if  the  image  is  corrupted 
by  smearing  and  additive  noise  provided  the  smearing  effect  and  the  magnitude  of  the 
noise  decay  sufficiently  quickly  as  f3  tends  to  infinity.  We  treat  the  piecewise  constant 
case  (i.e.  minimizing  E0)  and  piecewise  smooth  case  (i.e.  minimizing  E)  separately. 

4.1  Problem  Formulation 

We  will  be  examining  minimizers  of  E  and  E0  the  existence  of  which  is  asserted  by  the 
existence  theorems  (hence  by  E  we  mean  the  weak  form).  Solutions  are  determined 
by  T  (i.e.  for  a  fixed  T  the  optimal  /  if  unique)  and  we  will  often  refer  to  the  solution 
T  meaning  the  pair  f,T.  Also,  we  will  be  varying  the  parameter  f3  and  will  use  Tg  to 
indicate  an  optimal  solution  for  a  particular  value  of  (3. 

The  proofs  require  that  we  make  certain  assumptions  on  the  data  g.  The  limit 
theorem  has  been  proved  in  greater  generality  than  stated  in  this  paper  but  we  choose 
to  avoid  some  of  the  more  intricate  mathematics.  The  piecewise  constant  and  the 
piecewise  smooth  case  need  to  be  treated  in  part  separately.  The  piecewise  constants 
case  is  described  first. 

4.1.1  The  Piecewise  Constant  Case 

The  case  we  are  interested  in  is  one  in  which  the  image  is  a  corrupted  version  of  a 
piecewise  constant  L°°  function  g.  We  will  define  a  set  which  we  interpret  as  the 
natural  candidate  for  a  set  of  boundaries  in  the  image;  this  will  be  the  discontinuities 
of  the  image. 
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Suppose  that  fi  can  be  decomposed  into  a  countable  number  of  disjoint  sets  Aj 
having  piecewise  Lipschitz  boundaries  (say)  such  that  on  each  Aj,  g  is  constant.  We 
define  the  “boundary”  Bg  to  be  Cl  D  Uj  dAj.  We  assume  length(Ss)  <  oo  and  that 
if  length  ( dAi  D  dAj  >  0)  then  g{A, )  ^  g(Aj),  i.e.  the  boundary  should  be  “visible” 
from  g. 

4.1.2  The  Piecewise  Smooth  Case 

To  define  the  edges  in  the  piecewise  smooth  case  assume  g  £  L°°(Cl).  The  Lebesgue 
points  of  u,  i.e., 

{x  :  3z  :  lim  p~n  /  I u  —  z\dx  =  0} 

P-o+  r  JBp{x )  J 

are  the  points  of  approximate  continuity  g.  The  set  of  edges,  which  we  denote  by 
Bg  is  simply  the  complement  of  the  Lebesgue  points  of  u  in  Cl,  and  for  convenience 
we  assume  it  is  relatively  closed,  (a  weaker  sufficient  assumption  is  to  assume  that 

n%\Bg  =  o). 

The  following  summarizes  our  assumptions  in  both  cases. 

Assumption  1:  g  £  L°°(Cl ),  fn ^  |V<)|2  +  Hl(Bg)  <  oo  and  Bg  has  no  isolated 
points  i.e.  if  x  G  Bg  then  Vp  >  0,  %1( Bg  fl  Bp(x ))  >  0. 

Assumption  2:  If  A  C  Cl  is  an  open  set  satisfying  dist(A,  Bg)  >  0  then  there  exists 
an  L  <  oo  such  that  if  x  and  y  are  the  end  points  of  a  line  segment  lying  in 
A  then  then  |5(®)  —  £(y)|  <  L\x  —  y |.  We  refer  to  L  as  the  Lipschitz  constant 
associated  with  A. 

Essentially  we  have  assumed  that  g  £  C'0,1(f2\[£!ff]e)  for  any  e  >  0. 

4.1.3  The  Noise  Model 

In  this  section  we  describe  the  restrictions  on  the  relation  between  g  and  g  we  need 
to  make  in  order  for  the  limit  theorems  to  go  through.  A  succinct  statement  of  the 
assumptions  can  be  made  by  defining  a  parametrized  class  of  images  T {(3).  The 
following  are  our  assumptions  on  this  class. 

lim  sup  j3  f  [g  —  g)2  —  0, 

^°°s€  T(/3) 


(2) 
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and, 

Ve>0,  lim  sup  ||(^  -  ^)(1  -  X[»,].)l|oo  =  0  (3) 

^°°y6T(i9) 

Under  some  mild  additional  assumptions  we  can  convert  this  into  a  model  allowing 
smearing  and  bounded  additive  noise.  The  main  reason  for  allowing  smearing  is  not 
to  require  the  image  to  have  actual  jumps.  To  model  smearing  define  Sr  as  the  class 
of  maps  taking  L°°( fl)  to  having  the  property  that  the  value  of  the  image 

function  at  a  point  x  E  fi  lies  within  the  range  of  essential  values  that  the  argument 
function  takes  in  a  ball  of  radius  r  around  x.  This  models  in  a  quite  general  way 
smearing  of  the  image  and  hence  distortion  of  the  boundaries.  More  formally  $  €  Sr 
of  and  only  if  $  has  the  property 

$(ff)(x)  €  [ess  inf  glsr(x),  ess  sup 

An  example  of  such  a  $  would  be  a  smoothing  operator  defined  using  a  mollifier 
with  support  lying  inside  the  ball  of  radius  r,  but  nonlinear  perturbations  are  also 
allowed.  To  admit  this  model  of  smearing  it  is  convenient  to  make  the  following  mild 
assumption, 

There  is  a  constant  Cf,  <  oo  such  that  | [Bg)r  fl  fi|  <  c&r. 

This  assumption  is  automatically  satisfied  for  a  large  class  of  sets  containing  all  closed 
sets  having  finite  TLl  measure  and  finitely  many  connected  components.  This  is  a 
consequence  of  the  following  result  from  the  theory  of  Minkowski  content  [12], 

Proposition  4  [12]  Let  V  be  a  continuum  in  9?2  with  Til(T)  <  oo  then 

lim  Si  =  H\T). 

e— *o+  2e  v 

A  simplifying  assumption  for  the  piecewise  smooth  case  is  that  the  Lipshitz  con¬ 
stants  referred  to  in  Assumption  2  can  be  uniformly  bounded  i.e.  L  does  not  depend 
on  A.  (This  is  slightly  more  than  we  require  but  it  simplifies  the  statements  to  follow.) 
Now,  for  either  case  let  g  have  a  representation  of  the  form, 


g  =  $(g)  +  tiw 


(4) 
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for  some  $  £  Sr  and  w  £  L°°  with  ||w||oo  <  1  and  d  a  real  scalar.  Further,  assume 
that  there  are  functions  hT  :  (0,oo)  — >  [0,  oo)  and  h#  :  (0,oo)  — >  [0,  oo)  satisfying 

lim  /3hr(f3)  —  0 

f}—*oo 

lim  /35fi#(f3)  =  0. 

(3—*  oo 

Define  T (/3)  to  be  those  functions  g  which  can  be  written  in  the  form  4  for  some  4>r,w 
and  d  with  r  <  hr{(3 )  and  d  <  h$(f3).  It  now  follows  that  with  this  definition  of  T(/?) 
the  assumptions  2  and  3  are  satisfied  since 

0  f  (9  -  9?  <  P\\B*\r\  Ik  -  £||oo  +  0(LK(0)  +  t?)2|D|. 

4.2  Statement  of  the  Limit  Theorems 

Given  that  assumptions  1  and  2  are  satisfied,  the  following  holds, 

Theorem  5  As  (3  — >  oo  converges  to  Bg  with  respect  to  the  Hausdorff  metric, 

and  H1{ I^)  — +  T(}( Bg ).  Furthermore  —  g)  converges  to  0  in  Z-2(fi). 

We  mean  by  this  that  for  any  e  >  0  there  exists  /?'  >  oo  such  that  if  /3  >  (3' 
and  is  a  minimizer  of  E  (or  E0  as  the  case  may  be),  for  some  g  £  T (/?),  then 
dn(Tf},Bg)  <  e  and  —  'H1(0ff)|  <  e.  Note  that  this  shows  the  variational 

formulation  to  be  asymptotically  faithful  to  the  piecewise  smooth  assumption.  In  the 
limit  as  (3  — *  oo  the  geometry  of  the  boundaries  becomes  essentially  unrestricted. 
It  should  also  be  noted  that  in  the  limit  (3  — »  oo  any  discontinuity  in  the  image  will 
be  detected,  i.e.  edges  on  all  scales  are  recovered.  The  convergence  depends  strongly 
on  removal  or  rescaling  of  the  noise  (the  rates  for  which  are  tight).  To  convert  this" 
into  an  algorithm  we  need  to  re-interpret  the  rescaling  of  the  noise. 

5  A  Scaling  Algorithm 

It  was  pointed  out  in  Section  1  that  in  general  there  is  a  trade  off  between  the  accuracy 
of  localization  of  boundaries  found  by  the  variational  method  and  the  total  quantity 
of  boundary  admitted  into  the  solution.  (This  appears  to  be  true  for  most  edge 
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detection  techniques.)  Suppose  the  goal  of  the  segmentation  was  to  recover  objects 
only  above  a  certain  scale.  Consider  Figure  2  for  example  (assume  the  domain  is  very 
large),  if  one  were  trying  to  find  objects  on  the  scale  of  the  larger  square  and  not 
those  on  the  scale  of  the  smaller  square  by  minimizing  Eo  with  appropriately  chosen 
parameters,  then  it  is  necessary  to  incur  an  error  at  the  corners  of  at  least  (a/2  —  l)h 
as  illustrated  in  Figure  2. 

Now,  the  limit  theorem  discussed  in  the  preceding  section  state  that  as  (3  tends  to 
infinity  the  boundaries  which  are  found  by  solving  the  variational  problem  converge  to 
the  correct  ones  (i.e.  the  discontinuity  set  of  the  image)  with  respect  to  the  Hausdorff 
metric.  As  such,  these  theorem  do  not  provide  us  with  any  means  of  circumventing 
the  scale/ accuracy  trade-off  because  they  state  that  the  limit  of  the  minimizing  T 
includes  all  of  the  discontinuity  set  of  the  image.  We  ask  whether  it  is  possible  to 
take  the  limits  required  by  the  limit  theorem  while  avoiding  the  attendant  problem 
of  introducing  more  and  more  boundary  into  the  solution. 

In  response  to  this  question  we  sketch  an  algorithm  which  requires  within  it  two 
key  operations  or  procedures: 

Pi:  The  Minimization  of  E  (or  E0)  to  produce  f  and  T  with  the  parameters  a  and 
(3  as  input  variables 

P2:  The  updating  or  altering  of  the  image  by  smoothing  outside  some  neighborhood 
of  T  and  updating  the  parameters  which  provide  the  data  for  PI  for  resolution 
at  smaller  scale. 

The  interaction  between  these  two  procedures  is  illustrated  in  Figure  3.  An  execution 
of  the  algorithm  begins  by  minimizing  E  with  the  function  g  set  to  the  original  data" 
and  the  parameters  a  and  /3  chosen  to  provide  edge  detection  on  the  desired  scale. 
Once  minimal  /  and  T  have  been  determined  they  are  used  to  alter  the  function  g. 
The  new  g  is  formed  by  “smoothing”  the  original  g  outside  of  some  neighborhood  of 
T  while  leaving  it  unchanged  inside  the  neighborhood.  Since  /  is  a  piecewise  smooth 
approximation  to  g  which  respects  the  edges  T  a  simple  smoothing  technique  might 
be  to  take  a  convex  combination  of  /  and  g  (this  is  what  was  done  in  the  simulations). 
Simultaneously,  the  parameters  j3  and  a  are  assigned  new  values  such  as  would  be 
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9  =  0 


Figure  2:  Segmentation  of  Two  Squares  with  a  >  ^  >  b 


end 


Updated  g,0,a 


Figure  3:  A  Schematic  for  Scale  Independent  Segmentation 

used  with  the  original  data  if  the  edge  detection  were  desired  on  a  finer  scale.  We  then 
re-solve  the  problem  of  minimizing  E,  the  difference  this  time  being  that  we  use  the 
updated  image  and  parameter  values.  The  hope  is  that  we  should  detect  essentially" 
the  same  edges  as  before  only  now  with  finer  resolution.  This  procedure  can  then  be 
iterated  on  until  sufficient  accuracy  is  attained. 

As  stated  the  idea  of  the  algorithm  is  very  general  and  there  remains  within  each 
procedure  considerable  flexibility.  How  one  does  the  smoothing  or  chooses  neighbor¬ 
hoods  is  unspecified;  also,  we  have  not  stated  how  we  will  find  a  minimizer  of  the 
variational  problem.  Of  course,  in  order  for  the  algorithm  to  work  the  various  pa- 
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rameters  and  operations  must  be  coordinated  properly.  It  is  at  this  point  that  the 
limit  theorem  become  useful.  The  limit  theorem  effectively  governs  the  evolution  of 
the  parameters. 

If  we  remove  the  smoothing  of  the  image  from  procedure  P2  then  the  algorithm 
would  constitute  an  explicit  taking  of  the  same  limit  as  taken  in  the  limit  theorem. 
The  deficiency  with  the  limit  theorem  with  regard  to  their  direct  application  in  edge 
detection,  i.e.  the  reason  why  this  will  not  be  effective  is  the  fact  that  the  limit 
theorem  predicts  the  convergence  of  the  solution  T  to  the  entire  discontinuity  set  of 
the  image.  If  the  image  was  noisy  this  could  effectively  result  in  boundaries  being 
put  essentially  everywhere.  The  smoothing  of  the  image  plays  a  role  in  the  algorithm 
analogous  to  that  played  by  the  rescaling  of  the  noise  in  the  limit  theorem.  Small 
features  and  noise  are  smoothed  out,  but  at  the  same  time  the  detail  needed  for 
accurate  localization  of  the  large  scale  boundaries  is  retained. 

Recall  that  in  the  limit  theorem  the  amount  of  noise  which  can  be  allowed  while 
retaining  convergence  of  the  boundaries  has  been  quantified.  In  particular  we  con¬ 
sidered  sequences  fin,gn  such  that  {grn}  converges  to  the  ideal  image  accord¬ 
ing  to  lim^oo (3nSn(gn  -  goo)2  =  0,  where  fi  is  the  domain  of  the  image,  and 
limn-.oo  -  X[5JCO].)(fifn  -  0oo)||oo  =  0  for  any  e  >  0  (where  <Sfloo  is  the  sup¬ 

port  set  of  the  discontinuities  of  g^).  Now,  let  $r  represent  an  arbitrary  smearing 
operator  such  that  the  value  of  the  result  at  a  point  x  £  fl  lies  within  the  range  taken 
by  argument  in  Br(x )  and  let  w  represent  an  arbitrary  function  in  the  unit  ball  of 
Too(n).  We  argued  in  Section  4,  under  some  mild  regularity  assumptions  on  g^,  that 
the  convergence  conditions  were  satisfied  if  we  could  represent  the  gn  by, 


gn  =  $rn(<7oo)  + 

with  rn  being  a  sequence  of  constants  satisfying 

lim  j3nrn  -  0 


(5) 


and  t?n  being  a  sequence  of  constants  satisfying 
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The  idea  of  the  algorithm  is  to  produce  by  generating  the  gn. 

The  algorithm  produces  sequences  {/„},  {Tn},  {p„},  {fln}  such  that  /„  and  Tn 
are  found  by  minimizing  E(f3n,gn )  (or  E0),  and  it  also  produces  a  scalar  sequence 
K)  where  un  denotes  the  size  of  the  neighborhood  within  which  the  smoothing  is 
suppressed  at  stage  n.  The  algorithm  is  initialized  by  setting  g0  =  g  where  g  is  the 
original  data  and  by  choosing  u0  .  The  quantities,  gn,fln  and  un  are  defined  according 
to  the  following  schedule, 

9n+i  =  gn  T  hn(®)  e(/n  <7n) 

/?„+i  =  (1  -  e)~2/?„ 

/,  \2  /^0 

Un+ 1  =  (1  -€)«„  =  -J-Uo 

Pn 

The  function  hn(x )  controls  the  spatial  dependence  of  the  smoothing  which  is  effected 
by  partially  replacing  gn  with  fn.  For  simplicity,  and  to  be  consistent  with  our 
simulations  we  consider  setting  hn  equal  to  1  —  X[rn]«n-  The  parameter  u0  represents 
an  estimate  of  the  error  in  the  initial  boundary  locations.  For  the  simulations  tt0  has 
been  selected  heuristically  (see  also  Section  6).  The  formalism  just  presented  is  a 
discrete  one  i.e.  a  discrete  sequence  of  images  and  parameters  is  produced.  One  can 
in  principle  also  vary  the  parameter  n  continuously  and  represent  the  algorithm  as 
differential  equations.  In  this  context  we  can  say  that  e  represents  the  step  size  of  the 
algorithm. 

One  can  argue  heuristically  why  the  set  of  boundaries  should  remain  essentially 
unchanged  throughout  the  iterations  of  the  algorithm  for  reasonable  small  values  of 
uo .  For  simplicity,  consider  what  would  happen  if  we  set  K{x)  —  1  for  all  n,  i.e. 
u0  =  0.  It  is  not  very  difficult  to  check  that  the  solution  /0,  T0  would  be  a  local" 
minimum  for  the  functional  E(gi,(3i)  i.e.  if  we  consider  small  local  variations  in  the 
boundary  we  find  that  the  original  locations  are  optimal.  We  conjecture  that  f0  is  in 
fact  a  global  minimum  and  this  can  easily  be  seen  for  certain  special  cases  such  as 
the  image  of  a  square.  Because  the  original  solution  is  in  some  sense  being  reinforced 
by  the  feedback  we  expect  that  it  becomes  a  ‘deeper’  minima  then  previously.  This 
then  lends  a  certain  robustness  to  the  algorithm. 

There  are  several  observations  which  should  be  make  concerning  the  proposed 
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algorithm.  First,  since  the  algorithm  requires  minimizing  E  many  times  the  com¬ 
putational  load  will  obviously  be  higher  than  simply  minimizing  E.  The  algorithm 
has  been  proposed  to  demonstrate  the  possibility  of  overcoming  the  scale/accuracy 
tradeoff  inherent  in  the  original  model.  It  has  been  structured  to  parallel  as  much 
as  possible  the  asymptotic  theorem.  There  are  several  improvements  and  speed-ups 
that  can  be  considered.  Many  approaches  to  minimizing  E  would  admit  at  each 
stage  the  possibility  of  using  solution  from  one  step  as  an  initial  condition  for  the 
next,  thus  reducing  time  required  to  find  the  new  solution.  Another  possibility  would 
be  to  modify  the  parameter  /3  locally  within  the  image  domain  and  to  dispense  with 
the  smoothing  step  entirely.  This  could  also  substantially  save  on  computation.  A 
second  observation  concerns  the  magnitude  of  e.  Because  of  the  manner  in  which  we 
spatially  control  the  smoothing  of  the  data  we  cannot  afford  to  make  e  large.  The 
sharp  cutoff  of  the  region  in  which  smooth  (by  convex  combination)  could  create 
small  discontinuities  in  the  data  of  order  e.  It  is  important  therefore  in  our  particular 
implementation  that  e  not  be  too  large.  Having  smaller  e  on  the  other  hand  forces 
more  iterations  of  the  sequence  P1-P2  in  order  to  reach  the  desired  resolution.  A 
simple  improvement  might  be  force  the  feedback  to  have  vary  smoothly  spatially. 

Although  these  ideas  and  others  could  conceivably  improve  the  computational 
viability  of  the  proposed  algorithm  substantially  we  have  chosen  to  defer  these  re¬ 
finements  to  a  later  time.  Our  goal  here  is  to  build  on  the  asymptotic  results  for  the 
variational  formulation  in  the  simplist  way  possible  demonstrating  the  potential  for 
improving  on  the  formulation  via  certain  dynamics  in  scale. 

5.1  An  Example 

To  illustrate  the  algorithm  we  consider  minimizing  E0  when  the  image  is  that  of  a 
light  square  on  a  dark  background  such  as  in  Figure  4.  Let  S  be  the  support  set  of  the 
square  and  define  C  —  fl\S.  For  simplicity  assume  that  U  is  a  set  much  larger  then 
S  such  that  dist(5fi,  chS)  a.  In  this  case  the  P  which  minimizes  Eq  is  either  the 
emptyset  or  the  contour  indicated  in  Figure  4  which  is  symmetric  about  the  center  of 
5  and  consists  of  4  straight  line  segments  of  length  2(a  —  r)  one  centered  on  each  side 
of  S  and  the  4  components  one  obtains  from  intersecting  a  circle  of  radius  r  with  the 
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Figure  4:  Segmentation  of  a  Square 


four  quadrants  in  5i2.  Obviously  r  <  a.  We  will  denote  such  a  T  by  Fr .  The  following 
two  expressions  are  easily  derived. 


E0(n 

£o(0) 


(4  —  tc)t2\C\ 
\C\  -f  (4  —  7r  )r2 

gJ^\C\_ 

PU2  +  \C\ 


+  8(a  —  r)  +  2ir  r 


For  simplicity  consider  the  case  |(7|  =  oo.  By  minimizing  £'0(rr)  over  r  one  obtains 
r  =  /3~1.  Comparing  the  two  possible  solutions  one  discovers  that  there  is  a  threshold 
t  =  such  that  r  =  0  is  optimal  when  a/3  <  t  and  T  =  is  optimal  when 
a/3  >  t.  From  this  we  conclude  that  the  maximum  error,  i.e.  the  maximum  possible 
value  of, 

e  =  dH(dS,T)  =  {V2  -  l)r, 

which  can  occur  when  T  is  not  the  empty  set  is  In  general  e  is  proportional 

to  /?_1  thus  as  long  as  u0  >  (y/2  —  l)/3„ 1  then  the  Tn  produced  by  the  algorithm  will" 
either  converge  to  dS  in  Hausdorff  metric  as  n  ^  oo  or  will  equal  the  empty  set  for 
all  n.  When  a/3  =  t  then  either  Tn  =  0  for  all  n  or  for  some  n  rn  =  r^"1  and  then 
rm  =  for  all  m  >  n.  If  0  <  u0  <  (V2  -  and  a/3  >  t  then  the  Tn  will  not 

converge  to  dS  however  dff(Tn,dS)  will  be  decreasing. 
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6  A  T-Convergent  Approximation 

A  idea  which  plays  an  important  role  in  our  study  and  implementation  of  the  vari¬ 
ational  approach  to  the  edge  detection  problem  is  the  notion  of  T -convergence  due 
to  E.  De  Giorgi.  The  same  concept  was  developed  independently  in  France  under 
the  name  epi-convergence  by  H.  Attouch  [6].  It  concerns  variational  convergence,  i.e. 
the  approximation  of  one  variational  problem  by  another.  In  this  section  we  pro¬ 
vide  a  definition  of  T-convergence,  state  some  of  its  basic  properties  and  provide  an 
application  to  our  problem. 

Let  (S,  d)  be  a  separable  metric  space  and  let  Fn  :  S  — >  [0,+oo]  be  functions.  We 
say  Fn  T{S)  —  converges  to  F  :  S  — >  [0,  +oo]  if  the  following  two  conditions  hold  for 
all  x  £  S, 


Vxn  — >  x  liminf  Fn(xn)  >  Fix) 

and  — >  x  liminf  Fn(xn)  <  Fix) 

The  limit  F  when  it  exists  is  unique  and  lower-semicontinuous.  The  following  propo¬ 
sition  characterizes  the  main  properties  of  T-convergence. 

Proposition  6  (see  [5]  for  example)  Assume  that  Fn  T(S)-converges  to  F.  Then, 
the  following  statements  hold. 

(i)  Fn  +  G  T(S) -converges  to  F  +  G  for  every  continuous  function  G  :  S  —*  3?. 

(ii)  Let  tn  l  0.  Then,  every  cluster  point  of  the  sequence  of  sets 

{x  £  S  :  Fn(x)  <  inf  Fn  +  tn} 
s 

minimizes  F. 

(Hi)  Assume  that  the  functions  Fn  are  lower  semicontinuous  and  for  every  t  £  [0,oo) 
and  there  exists  a  compact  set  Kt  C  S  with 

{x  £  S  :  Fn(x)  <  t}  C  Kt  Vn  £  M 

Then,  the  functions  Fn  have  minimizers  in  S,  and  any  sequence  xn  of  minimizers  of 
Fn  admits  subsequences  converging  to  some  minimizer  F . 
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A  significant  contribution  of  T  convergence  to  our  problem  is  that  via  a  T-convergent 
approximation  one  can  represent  the  boundaries  by  a  function  defined  on  the  same 
domain  as  the  image.  The  functional  given  below,  for  example  is  a  modified  version 
of  an  approximation  scheme  due  to  Ambrosio  and  Tortorelli  [5]. 

En(f,v)  =  0  f(f-g)2+  /  e— 2|V/|2  4 -a  f  (e— 2|Vu|2  + 

JO  JO  JO  lb 

In  this  approximation  one  replaces  the  set  T  C  0  with  a  function  v  :  fl  — ►  [0,oo]. 
The  location  of  boundaries  is  in  general  given  by  e~nv2  ~  0. 

It  was  proved  in  [5]  that  the  sequence  of  functionals  very  similar  to  {!?"}  (with 
(1—  v2)n  replacing  e~n”2),  Y -converges  to  E.  This  proof  can  be  modified  and  extended 
to  include  the  functional  given  above.  The  metric  space  in  this  case  is  a  product  of 
function  spaces  elements  of  which  are  pairs  (/,  v). 

The  functionals  En  can  be  discretized  by  finite  elements  thus  obtaining  a  sim¬ 
ple  representation  of  the  edges  suitable  for  computation.  Our  simulations  employ 
a  gradient  descent  to  find  local  minimizers  of  En.  This  approach  closely  resembles 
the  an-isotropic  filtering  approach  due  to  Perona  and  Malik  [30],  In  our  case  how¬ 
ever  we  obtain  an  explicit  representation  of  the  boundary  which  would  be  useful  for 
subsequent  processing. 

The  function  v  appearing  in  the  T-convergent  approximation  is  such  that  1  — e-"”2 
has  the  appearance  of  a  smoothed  neighborhood  of  the  boundaries.  The  boundaries 
themselves  can  be  identified  with  those  locations  where  e~nv 2  ~  0.  It  can  be  shown 
(see  [31])  that  by  thresholding  e_n”2  at  a  level  t  one  obtains  a  set  which  can  be 
interpreted  as  a  neighborhood  of  T  of  size  ut  where, 

(7) 

2  J-iny/i  r  v  ’ 

This  means  that  we  can  define  a  neighborhood  of  T  to  be  { x  :  e~nv2  <  t}  and  that  this 
should  approximate  [T]Ut  where  ut  ~  £  /“invt  ^ r •  Thus  sublevel  sets  of  e~nv3 

can  provide  an  approximation  to  the  neighborhoods  of  T  required  by  the  algorithm 
outlined  in  the  previous  section.  For  the  simulations  presented  in  this  paper  t  has 
been  set  to  |.  The  equation  7  then  yields  the  relation  ai  ~ 
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7  Computation 

In  this  section  we  describe  how  the  ideas  presented  in  the  Section  5  are  refined  in  the 
piecewise  smooth  case  by  introducing  the  T-convergent  sequence  of  approximations 
to  E  as  a  means  to  generate  both  the  boundary  locations  and  their  neighborhoods. 

The  procedure  PI  of  the  algorithm  will  now  be  implemented  by  the  minimization 
of  the  following  functional  over  /  and  v, 

E(f,v,g,J3,a,n)  =  l3  f  (f  -  g)2  +  [  (1  -  v2)n\Vf\2  +  a  [  ((1  -  u2)"|Vu|2  +  ^). 

J  n  Jn  J  n  16 

The  procedure  P2  will  be  implemented  by  altering  the  other  variables.  One  can  write 

down  the  Euler- Lagrange  equations  for  v  and  /  associated  with  this  functional  and 

then  the  parabolic  equations  which  would  be  associated  with  a  descent  algorithm. 

For  E  as  above  we  obtain, 

%  =  c/[v.(e-",V/)  -0V-,)] 

^  =  c.  [«V  ■  (e-"”3 Vu)  +  ti(|V/|3  +  a|Vi>|2)e-nl,3v  —  -^-v 

ut  ID 

with  Neumann  boundary  conditions.  The  parameters  Cf  and  c„  are  arbitrary  positive 
constants  which  controls  the  relative  rate  of  descent.  These  equations  resemble  the 
non-linear  filtering  scheme  of  Perona  and  Malik  [30].  The  differences  are  worth  noting. 
The  equations  presented  here  have  a  term  dependent  on  g;  unlike  Perona  and  Malik 
we  do  not  necessarily  converge  to  a  piecewise  constant  function.  Also  the  control  of 
the  conductivity  associated  with  the  diffusion  of  the  image  is  effected  by  the  function 
v  rather  than  by  an  explicit  function  of  the  magnitude  of  gradient  of  /.  The  function 
v  is  governed  by  another  partial  differential  equation  whose  driving  term  is  related  to " 
the  gradient  of  /. 

Since  the  functional  is  not  convex  in  v  we  do  not  expect  to  always  reach  a  global 
minimum  by  a  descent  method.  Also  the  dependence  of  the  solution  on  the  initial 
conditions,  and  the  constants  c/  and  cv  is  significant.  For  example  if  c/  is  of  much 
smaller  than  cv  and  we  initialize  /  by  setting  it  equal  to  g  then  the  system  of  equations 
places  emphasis  on  the  height  of  edges.  The  evolution  of  v  is  initially  governed 
essentially  by  |Vy|2  and  thus  will  tend  to  place  boundaries  at  edges  in  the  image 
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largely  ignoring  the  size  of  the  feature  of  which  the  edge  is  a  boundary.  Conversely 
if  c„  is  much  smaller  than  c/  or  if  /  is  initialized  by  a  smoothed  version  of  g  then 
the  geometry  of  the  features  will  play  an  important  role  since  a  smaller  features  will 
produce  a  smaller  gradients  in  /  even  for  the  same  height  of  the  discontinuity.  Thus 
boundaries  will  be  more  likely  to  appear  at  the  edges  of  larger  objects,  everything  else 
being  equal.  Consider  for  example  the  image  of  the  two  squares  Figure  2.  Suppose 
we  initialize  the  descent  equations  with  /  set  to  the  solution  of  A f  —  /3(f  —  g )  and  /3 
satisfies  -^=  ~  b.  The  smaller  square  (with  sides  of  length  b )  will  have  a  smaller  effect 
on  the  initial  /.  That  is,  the  gradient  of  the  initial  /  will  be  smaller  near  the  edges  of 
the  smaller  square  than  near  those  of  the  larger  square.  Consequently  if  a  is  chosen 
appropriately  the  equations  above  will  have  a  greater  tendency  to  increase  v  i.e.  to 
place  an  edge  near  the  edges  of  the  larger  square  than  near  those  of  the  smaller. 

Whatever  choice  is  made  concerning  the  selection  of  the  various  parameters  asso¬ 
ciated  with  the  computation  it  is  important  that  they  be  kept  consistent  throughout 
the  iterations  of  the  algorithm.  The  intent  of  the  algorithm  is  to  refine  the  boundaries 
found  in  the  early  stages,  not  to  radically  change  them.  To  achieve  this  a  consistent 
computational  approach  is  necessary.  We  mentioned  in  Section  5.1.1  that  the  feed¬ 
back  will  have  the  effect  of  reinforcing  the  solution  found  during  the  earlier  stages  of 
the  algorithm,  tending  to  make  that  solution  a  ‘deeper’  minima  than  initially.  The 
same  argument  holds  for  the  local  minima  which  will  be  found  by  a  computational 
procedure  such  as  we  have  described.  As  long  as  the  computation  remains  consistent 
from  iteration  to  iteration  then  the  feedback  should  make  the  algorithm  more  robust 
in  the  sense  that  it  encourages  the  finding  of  essentially  the  same  solution. 

7.1  Discretization 

In  this  section  a  particular  discretization  the  functionals  En  is  presented.  This  for¬ 
mulation  may  have  value  in  that  it  lends  itself  to  possibility  of  an  analog  hardware 
VLSI  implementation  such  as  considered  in  Harris  et.  al.  [21]. 

For  the  simulations  presented  in  this  paper  /,  g  and  v  were  discretized  by  finite 
elements  in  a  manner  described  below.  Discrete  versions  of  /  and  g  are  defined  on 
a  square  lattice  with  lattice  constant  8  while  the  discrete  version  of  v  is  defined  on 
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one  twice  as  dense.  This  is  not  necessary  but  it  facilitates  the  implementation  of  the 
discrete  problem.  Figure  5  indicates  the  assignment  of  lattice  values.  For  convenience 
the  variables  are  labeled  as  in  standard  matrix  notation;  thus  fij  denotes  the  variable 
associated  with  /  at  the  lattice  location  row  i,  column  j.  To  keep  the  notation  for  the 
function  v  consistent  with  the  lattice  used  for  /  and  g  the  variables  associated  with 
v  are  partitioned  into  two  sets,  vh  and  vv,  corresponding  in  some  sense  to  horizontal 
and  vertical  edge  elements.  The  assignments  are  as  in  Figure  5.  A  suitable  discrete 
version  of  E  in  terms  of  these  variables  is  the  following, 


hj 

£(/«  -  +  (hi  ~ 


+il§‘" 


E  ( 


vv, 


hj 


t-WP+E  E  (vhi-w ,, 


(  E  e  nvvlj (vvij  -  vh{iji)  +  e 

2  16  EK  +  t >hh) 


+a 
a  S2n 2 


where  J\fh(i,j )  is  the  set  of  indices  for  the  nearest  vertical  edge  element  neighbors  of 
vhij  and  similarly  Afv(i,j)  is  the  set  of  indices  for  the  nearest  horizontal  edge  element 
neighbors  of  Wij.  A  discrete  form  of  the  Euler-Lagrange  equations  for  this  system 
is  found  by  differentiating  the  expression  above  with  respect  to  the  various  elements. 
Associated  with  this  one  obtains  a  gradient  descent  equations  of  the  following  form, 


ft+ 1  ft  _ 

~cs 

»viy  -  ™‘j  = 

cv 

d 


dfij 

d 


E 

-E 


d 


Vdvhij 


E 


(8)" 

(9) 


where  the  variables  c/  and  cv  control  the  stepsizes  of  the  algorithm  and  t  denotes 
‘time’  or  steps  in  the  algorithm.  For  our  simulations  we  updated  /  by  using  the 
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Figure  5:  Lattice  Variables  and  Finite  Elements 


assignment, 

cf  =  2  (82(3  +  e-n<i  +  e~nvvi +  e"n,,fcu  +  e~nvhi’>~1 )  _1 

in  keeping  with  standard  relaxation  algorithms.  The  constant  cv  should  be  scaled 
with  n.  The  details  can  be  found  in  [31];  c„  should  be  scaled  as  ra~1-5. 

In  general  some  upper  limit  on  n  must  be  imposed  for  a  given  lattice  spacing. 
Consider  the  behavior  of  v  in  the  T-convergent  approximation  for  large  n.  For  each 
point  i  in  the  array  there  is  a  term  in  the  cost  proportional  to  n2vf.  Now,  as  n 
becomes  large  it  is  necessary  for  the  cost  to  remain  bounded  that  V{  decrease  like  V 
However  limn_too(l-^)n  =  1  for  any  K  <  oo,  i.e.  as  n  tends  to  infinity  all  boundaries 
will  be  removed.  There  is  a  secondary  positive  feedback  effect  which  aggravates  this 
problem.  An  increase  in  e~nv2  results  in  an  increase  in  the  smoothness  of  /  i.e.  a 
reduction  in  |  V/|.  This  causes  yet  a  further  increase  in  e_n"3 .  Thus  it  is  apparent  that 
the  discretized  version  of  this  approximation  becomes  unreliable  for  large  n.  Another 
reason  for  keeping  n  small  is  that  if  the  wider  the  ‘support’  of  the  edges  in  terms 
of  the  lattice  constant  the  smaller  the  effect  of  the  discretization.  In  particular  the 
rotational  invariance  of  the  continuum  formulation  is  better  retained. 
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8  Simulation  Results 

We  have  simulated  the  algorithm  developed  in  this  Section  on  the  image  shown  in 
Figure  6.  The  size  of  the  image  is  230  x  216  pixels.  The  image  is  a  well  known 
one  but  the  version  used  is  an  unusally  noisy  one.  It  was  taken  from  a  bitmap  i.e. 
a  0\1  image  of  size  920  x  864  bits  and  each  4x4  block  was  mapped  onto  a  single 
pixel.  The  value  of  the  image  g  at  a  given  pixel  is  proportional  to  the  number  of 
l’s  in  the  associated  4x4  block  and  is  scaled  so  that  the  range  of  g  lies  within 
[0,2].  The  image  is  thus  quite  noisy.  Our  displays  are  also  bitmaps  where  we  have 
reversed  this  procedure.  Thus  our  resolution  is  essentially  4  bits  per  pixel  although 
the  computations  were  done  using  64  bit  floating  arithmetic.  This  image  is  a  rather 
problematic  one  for  edge  detection  because  many  of  the  edges  are  blurred  and  there 
are  regions  of  texture.  We  have  performed  the  simulation  for  several  scales.  For 
one  scale  which  we  denoted  ‘Scale  2’  we  have  sampled  the  functions,  g  and  e~nv2  at 
various  stages  of  the  algorithm.  What  is  worth  noticing  is  how  the  fine  detail  such 
as  sharp  corners  and  T-junctions  are  recovered  in  the  final  stages.  This  can  be  seen 
particularly  in  the  details  of  the  eyes.  Also,  as  predicted  the  global  properties  of  the 
solution  remain  essentially  unchanged.  Because  the  edges  themselves  are  blurred  and 
noisy  at  fine  scales  we  begin  to  observe  multiple  edges.  Even  though  0  is  increased  by 
a  factor  of  13  the  particular  boundaries  which  are  found  do  not  change  except  in  the 
fine  detail  of  the  localization.  (In  other  experiments  0  was  allowed  to  increase  over 
a  much  larger  range  and  similar  results  were  obtained.)  In  figure  9  we  display  e~n®J 
from  solutions  obtained  for  several  scales.  Notice  that  the  set  of  boundaries  found  is 
essentially  monotonic  in  scale. 
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8 

0.05 

Range  of  g 

[0.0, 2.0] 

Stepsize  c„ 

0.3 
an1. 5 

e 

0.05 

Table  1: 


General  Parameters  for  ‘Lenna’  Simulations 


‘Scale’ 

a 

Initial  j3 

Final  /? 

Initial  n 

Final  n 

1 

0.008 

2.0 

26.0 

3.0 

39.0 

2 

0.006 

3.0 

39.0 

3.0 

39.0 

3 

0.0045 

7.0 

91.0 

8.0 

104.0 

4 

0.003 

10.0 

130.0 

10.0 

130.0 

Table  2:  Parameters  for  Simulation  ‘Lenna:  All  Scales’ 


Sample  Number 

a 

P 

n 

1 

0.006 

3.0 

3.0 

2 

0.006 

5.0 

5.0 

3 

0.006 

14.0 

14.0 

4 

0.006 

39.0 

39.0 

Table  3:  Parameters  for  Simulation  ‘Lenna:  Scale  2’ 
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Figure  6:  “Lenna”  Data  Used  for  Simulations 
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Sample  1 


Sample  4 


Figure  8: 


Lenna  Scale  2:  Samples  of  Updated  g 


Scale  2 


Scale  4 


Figure  9:  Original  Image  with  Edges 
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